Using MAP estimated parameters to improve HMM speech recognition performance

Authors

Yoshihiko Gotoh

Mike Hochberg

Harvey F. Silverman

Abstract

Affiliations: Yoshihiko Gotoh and Harvey F. Silverman, LEMS, Division of Engineering, Brown University, Providence, RI 02912 USA; Michael M. Hochberg, Cambridge University Engineering Department, Trumpington Street, Cambridge CB2 1PZ UK.

Hidden Markov models (HMMs) have been quite successfully applied to speech recognition tasks, but many unsolved problems still remain. HMMs do not directly model all phenomena that might be useful for recognition. This is the case, for example, for duration modeling. Mechanisms are needed to incorporate additional information into an HMM system. This paper presents a maximum a posteriori (MAP) parameter estimation approach for improving the state-duration modeling capability and incorporating a priori knowledge about the word-duration distribution into an HMM. The MAP-based approach is evaluated on a talker-independent, connected alphadigit task for various prior distributions on duration. The results (in terms of both computational complexity and recognition performance) are compared with the results of HMM-based systems trained with the traditional maximum-likelihood criterion.
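
To make the contrast between maximum-likelihood and MAP estimation concrete, here is a minimal sketch for a single HMM state. It is not the authors' method: it assumes a standard HMM in which state duration is geometric in the self-transition probability, and it assumes a Beta prior on that probability. The function names, the prior parameters, and the counts are all hypothetical, chosen only for illustration.

```python
def ml_self_transition(n_stay: float, n_leave: float) -> float:
    """Maximum-likelihood estimate of a state's self-transition probability."""
    return n_stay / (n_stay + n_leave)


def map_self_transition(n_stay: float, n_leave: float,
                        alpha: float, beta: float) -> float:
    """MAP estimate (posterior mode) under an assumed Beta(alpha, beta) prior.

    The mode formula is valid when n_stay + alpha > 1 and n_leave + beta > 1.
    """
    numerator = n_stay + alpha - 1.0
    denominator = n_stay + n_leave + alpha + beta - 2.0
    return numerator / denominator


def expected_duration(a_self: float) -> float:
    """Mean of the geometric duration distribution implied by a_self."""
    return 1.0 / (1.0 - a_self)


# Hypothetical counts from sparse training data: 3 self-loops, 1 exit.
a_ml = ml_self_transition(3, 1)
# Hypothetical prior whose mode is 0.8 (about 5 frames per state visit).
a_map = map_self_transition(3, 1, alpha=9.0, beta=3.0)

print(a_ml, expected_duration(a_ml))
print(a_map, expected_duration(a_map))
```

With so few observations, the MAP estimate lies between the maximum-likelihood estimate and the prior mode. As the counts grow, the data dominate and the two estimates converge.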

Similar resources

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition

The hidden Markov model is a popular statistical method that is used in continuous and discrete speech recognition. The probability density function of observation vectors in each state is estimated with discrete density or continuous density modeling. The performance (in correct word recognition rate) of continuous density HMMs is higher than that of discrete density HMMs, but its computation complexity is very ...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using a deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems: training and testing. Mos...

Bayesian affine transformation of HMM parameters for instantaneous and supervised adaptation in telephone speech recognition

This paper proposes a Bayesian affine transformation of hidden Markov model (HMM) parameters for reducing the acoustic mismatch problem in telephone speech recognition. Our purpose is to transform the existing HMM parameters into a new version for a specific telephone environment using an affine function, so as to improve the recognition rate. The maximum a posteriori (MAP) estimation which merges t...

Publication date: 1994
